UNICITY CONDITIONS FOR LOW-RANK MATRIX RECOVERY 

Y. C. ELDAR, D. NEEDELL, AND Y. PLAN 



Abstract. Low-rank matrix recovery addresses the problem of recovering an
unknown low-rank matrix from few linear measurements. Nuclear-norm minimization
is a tractable approach with a recent surge of strong theoretical backing.
Analogous to the theory of compressed sensing, these results have required random
measurements. For example, $m \geq Cnr$ Gaussian measurements are sufficient
to recover any rank-$r$ $n \times n$ matrix with high probability. In this paper we address
the theoretical question of how many measurements are needed via any method
whatsoever, tractable or not. We show that for a family of random measurement
ensembles, $m \geq 4nr - 4r^2$ measurements are sufficient to guarantee that no
rank-$2r$ matrix lies in the null space of the measurement operator with probability
one. This is a necessary and sufficient condition to ensure uniform recovery of
all rank-$r$ matrices by rank minimization. Furthermore, this value of $m$ precisely
matches the dimension of the manifold of all rank-$2r$ matrices. We also prove that
for a fixed rank-$r$ matrix, $m \geq 2nr - r^2 + 1$ random measurements are enough to
guarantee recovery using rank minimization. These results give a benchmark to
which we may compare the efficacy of nuclear-norm minimization.



1. Introduction 

In the compressed sensing problem, one wishes to recover an unknown vector $x \in \mathbb{R}^n$
from few linear measurements of the form $y = Ax \in \mathbb{R}^m$, where $A$ is an $m \times n$
measurement matrix and $m \ll n$ (see e.g. [HI [H E] for tutorials on compressed
sensing). This problem is clearly ill-posed until additional assumptions are enforced.
A common assumption is that $x$ is $s$-sparse: the support of $x$ is small,
$\|x\|_0 = |\mathrm{supp}(x)| \leq s \ll n$. If $A$ is injective on all $s$-sparse vectors, then
when $x$ is $s$-sparse, $x$ will be the solution to



$(L_0)$   $\hat{x} = \operatorname{argmin}_{w} \|w\|_0$ such that $Aw = y$.

Moreover, for a matrix $A$ to be injective on all $s$-sparse vectors, we precisely require
that its null space contain no nonzero $2s$-sparse vector. Since there are
many classes of matrices satisfying this property with $m = 2s$ rows (see e.g. [SI
Theorem 1.1]), this shows that only $2s$ measurements are required to recover all
$s$-sparse vectors $x \in \mathbb{R}^n$! If we consider the problem of weak recovery, where we
only wish to recover one fixed vector $x$, $s + 1$ measurements suffice. These are of
course theoretical requirements, as the problem $(L_0)$ is a combinatorial optimization
problem and is NP-hard in general (see Sec. 9.2.2 of [TU]).
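
As a small illustration of the $2s$-row claim, the sketch below checks by brute force that a Vandermonde matrix with distinct nodes and $m = 2s$ rows has no nonzero $2s$-sparse vector in its null space, which is equivalent to every set of $2s$ columns being linearly independent. The Vandermonde choice and the helper names `rank`, `vandermonde` and `injective_on_sparse` are assumptions made for this example; they are not necessarily the construction used in the cited reference.

```python
from fractions import Fraction
from itertools import combinations


def rank(rows):
    """Exact rank of a matrix (list of rows) by Gaussian elimination over Fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    n_cols = len(m[0]) if m else 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def vandermonde(num_rows, nodes):
    """Row k holds the k-th powers of the nodes (nodes are assumed distinct)."""
    return [[Fraction(t) ** k for t in nodes] for k in range(num_rows)]


def injective_on_sparse(A, s):
    """True if every set of 2s columns of A is linearly independent,
    i.e. no nonzero 2s-sparse vector lies in the null space of A."""
    n = len(A[0])
    for cols in combinations(range(n), 2 * s):
        sub = [[row[j] for j in cols] for row in A]
        if rank(sub) < 2 * s:
            return False
    return True


s, n = 2, 7
A = vandermonde(2 * s, range(1, n + 1))      # m = 2s = 4 measurements
print(injective_on_sparse(A, s))             # 2s rows: the property holds
A_short = vandermonde(2 * s - 1, range(1, n + 1))
print(injective_on_sparse(A_short, s))       # 2s - 1 rows: it must fail
```

With one row fewer, any $2s$ columns live in a space of dimension $2s - 1$ and are therefore dependent, which is why the second check fails. The brute-force loop over column subsets also hints at why $(L_0)$ is combinatorial.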

Work in the field of compressed sensing has however provided us with numerically
feasible methods for sparse signal recovery. One such method is $\ell_1$-minimization,
which is a relaxation of $(L_0)$:

$(L_1)$   $\hat{x} = \operatorname{argmin}_{z} \|z\|_1$ such that $Az = y$.

It has been shown that for certain measurement matrices $A$, $(L_0)$ and $(L_1)$ are
equivalent [TTl 17]. These measurement ensembles can be taken randomly (for example, $A$
can be chosen to have Gaussian entries), and require $m = O(s \log(n/s))$ measurements
to guarantee reconstruction of all $s$-sparse vectors. Thus we require slightly
more measurements (from $2s$ to $Cs\log(n/s)$) but can recover via the problem $(L_1)$,
which is numerically feasible by linear programming methods. For weak recovery
we need only slightly fewer measurements; see [12] for precise thresholds.

1.1. Low-Rank Matrix Recovery. A related problem to compressed sensing is
the problem of low-rank matrix recovery, for which many results have been obtained
(see e.g. [21SlEl[ini[IHl[I51[ni[23[I31l3]). In this setting, we would like to recover
a matrix $M$ from few of its linear measurements. The measurement operator is of
the form $\mathcal{A} : \mathbb{R}^{n \times n} \to \mathbb{R}^m$ and acts on a matrix $M$ by
$(\mathcal{A}(M))_i = \langle A_i, M \rangle$, where the $A_i$
are $n \times n$ matrices and $\langle \cdot, \cdot \rangle$ denotes the usual matrix inner product:

$\langle A, B \rangle = \operatorname{trace}(A^* B)$.
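
To make the measurement operator concrete, here is a minimal sketch for real matrices stored as nested lists, where the adjoint is the transpose and the trace inner product reduces to a sum of entrywise products. The function names `inner` and `measure` and the two example matrices are illustrative assumptions.

```python
def inner(A, B):
    """Matrix inner product <A, B> = trace(A^T B) for real n x n matrices."""
    n = len(A)
    # trace(A^T B) = sum_j (A^T B)[j][j] = sum_j sum_i A[i][j] * B[i][j]
    return sum(A[i][j] * B[i][j] for i in range(n) for j in range(n))


def measure(As, M):
    """Apply the measurement operator: returns [<A_1, M>, ..., <A_m, M>]."""
    return [inner(A, M) for A in As]


A1 = [[1, 0], [0, 1]]        # measures trace(M)
A2 = [[0, 1], [0, 0]]        # measures the single entry M[0][1]
M = [[2, 5], [7, 3]]
print(measure([A1, A2], M))  # [5, 5]
```

Each measurement is one linear functional of the $n^2$ entries of $M$, so $m$ measurements define a linear map into $m$-dimensional space.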

Given the measurements $\mathcal{A}(M) \in \mathbb{R}^m$, we wish to recover the matrix $M \in \mathbb{R}^{n \times n}$.
This is of course again ill-posed for small $m$ in general. However, if we operate under
the assumption that $M$ has low rank then the problem can be made well-posed. The
question then becomes: how large does $m$ need to be in order to guarantee recovery of
rank-$r$ matrices, and how does one recover such a matrix? Analogous to the program
$(L_0)$, one can consider solving

(1.1)   $\hat{X} = \operatorname{argmin}_{X} \operatorname{rank}(X)$ such that $\mathcal{A}(X) = \mathcal{A}(M)$.

This is simply a uniqueness problem: when is $M$ the unique low-rank matrix having
these measurements? However, as in the case of $(L_0)$, the problem (1.1) is intractable
in general.

Instead of solving (1.1), we are often interested in a tractable method which provides
worst-case guarantees; that is, guarantees which apply to all rank-$r$ matrices whether
arbitrary or adversarial. A simple observation allows one to select an appropriate
relaxation of (1.1) that will do just this. The rank of a matrix is the number of
non-zero singular values. That is, if $\sigma$ is the vector of singular values of $M$, then
$\operatorname{rank}(M) = \|\sigma\|_0$. Thus a natural relaxation would be to minimize $\|\sigma\|_1$. We thus
consider the minimization problem

(1.2)   $\hat{X} = \operatorname{argmin}_{X} \|X\|_*$ such that $\mathcal{A}(X) = \mathcal{A}(M)$,

where $\| \cdot \|_*$ denotes the nuclear norm, which is defined by

$\|X\|_* = \operatorname{trace}\left(\sqrt{X^* X}\right) = \sum_{i=1}^{n} \sigma_i$.

The program (1.2) can be cast as a semidefinite program (SDP) and is therefore
numerically feasible. Moreover, it has been shown [20| [25| |2T| |3] that $m \geq Cnr$
measurements suffice to recover any $n \times n$ rank-$r$ matrix via (1.2).

A question that does not appear to have been previously addressed is: how many
measurements suffice to recover rank-$r$ matrices via the more natural (yet intractable)
method (1.1)? In the compressed sensing setting, this question was easy to answer
because the set of $s$-sparse vectors is the union of a finite number of linear subspaces.
In the matrix recovery problem, however, this question has remained unresolved.
Answering this question would not only fill a gap in the literature but also give
theoretical bounds on the number of measurements required for low-rank matrix
recovery against which those for problem (1.2) may be compared. In the case of
compressed sensing, for example, it is clear that to use a tractable method we pay
in the number of measurements by a factor of $\log(n/s)$. What is this factor in the
matrix recovery framework? How good is nuclear-norm minimization? These are
the issues we address in this work.

In this paper we prove that $4nr - 4r^2$ measurements are sufficient to recover all
rank-$r$ $n \times n$ matrices using rank minimization almost surely. To recover a fixed
rank-$r$ $n \times n$ matrix with probability one, we show that only $2nr - r^2 + 1$ measurements
are required. We then compare our results to nuclear-norm minimization and
show that rank minimization requires fewer measurements, but only by a constant
factor.

2. Uniqueness Results 

In this section we provide a detailed summary of our main results. 

We consider random operators $\mathcal{A}$, and first ask that for any rank-$r$ matrix $M$,
the solution to (1.1) is $\hat{X} = M$ with probability one. If this were not the case,
then there would be some matrices $M$ and $M'$, each of rank $r$ or less, such that
$\mathcal{A}(M) = \mathcal{A}(M')$. This means that the matrix $M - M'$, which has rank $2r$ or less,
is in the null space of $\mathcal{A}$. Therefore, to guarantee that (1.1) reconstructs all rank-$r$ matrices,
a necessary and sufficient condition is that there are no nonzero matrices of rank $2r$ or less
in the null space of $\mathcal{A}$. Thus we examine the following subset of $\mathbb{R}^{n \times n}$:

(2.1)   $\mathcal{R}' = \{X \in \mathbb{R}^{n \times n} : \operatorname{rank}(X) = 2r\}$.

We first wish to compute how large $m$ must be so that the null space of $\mathcal{A}$ is disjoint
from $\mathcal{R}'$. We will then repeat this argument for smaller values of the rank.

It is well known that $\mathcal{R}'$ is a manifold with $4nr - 4r^2$ dimensions. Is $m \geq 4nr - 4r^2$
sufficient to guarantee uniform recovery? We will show that the answer is yes! This
is summarized by the following theorem.

Below, we call $\mathcal{A}$ a Gaussian operator if the matrices $A_i$ are independent and each
has i.i.d. Gaussian entries.

Theorem 2.1 (Strong Recovery). Let $r \leq n/2$. When $\mathcal{A} : \mathbb{R}^{n \times n} \to \mathbb{R}^m$ is a
Gaussian operator with $m \geq 4nr - 4r^2$, problem (1.1) recovers all rank-$r$ matrices
with probability 1.



Remarks. 

1. We actually prove a more general result in Theorem 3.1. In this result we consider
any random linear operator $\mathcal{A}$ which takes $m \geq d + 1$ measurements $\langle A_i, X \rangle$, where
the $\langle A_i, X \rangle$ are independent and do not concentrate around zero. Then Theorem 3.1
shows that any $d$-dimensional continuously differentiable manifold over the set of
real matrices is disjoint (except possibly at the origin) from the null space of $\mathcal{A}$.
Theorem 2.1 will follow as a consequence.

2. We consider real-valued matrices but our method can easily be extended to
complex-valued matrices as well.

Our proof technique also allows us to provide a bound on the number of measurements
required for weak recovery. Recall that in this framework we are interested in
recovering one fixed matrix $M$ with high probability. Since $M$ is fixed, we require
only that for all rank-$r$ matrices $X \neq M$, the difference $X - M$ is not in the null space of $\mathcal{A}$.
The set of all rank-$r$ matrices is a manifold of dimension $2nr - r^2$. Recall that in
compressed sensing for weak recovery we require a number of measurements equal
to at least one more than the sparsity level. The following result shows that for
weak recovery of low-rank matrices we require a number of measurements at least
one more than the dimension of the manifold of all rank-$r$ matrices.

Theorem 2.2 (Weak Recovery). Fix a rank-$r$ $n \times n$ real matrix $M$. When
$\mathcal{A} : \mathbb{R}^{n \times n} \to \mathbb{R}^m$ is a Gaussian operator with $m \geq 2nr - r^2 + 1$, problem (1.1) recovers
the matrix $M$ with probability 1.
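
The measurement counts in Theorems 2.1 and 2.2 are easy to tabulate. The sketch below evaluates them next to the manifold dimensions they are compared with; the function names are hypothetical, and the values $n = 10$, $r = 2$ are chosen only for illustration.

```python
def fixed_rank_dim(n, k):
    """Dimension 2nk - k^2 of the manifold of n x n matrices of rank exactly k."""
    return 2 * n * k - k * k


def strong_bound(n, r):
    """Measurements sufficient for recovering all rank-r matrices (Theorem 2.1)."""
    return 4 * n * r - 4 * r * r


def weak_bound(n, r):
    """Measurements sufficient for recovering one fixed rank-r matrix (Theorem 2.2)."""
    return 2 * n * r - r * r + 1


n, r = 10, 2
print(strong_bound(n, r), fixed_rank_dim(n, 2 * r))   # 64 64
print(weak_bound(n, r), fixed_rank_dim(n, r) + 1)     # 37 37
print(n * n)                                          # 100 entries in total
```

The strong bound coincides with the dimension of the rank-$2r$ manifold, and the weak bound is one more than the dimension of the rank-$r$ manifold; both are well below the $n^2$ entries of the matrix when $r$ is small.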



As we will see in Section 4, this theorem allows the comparison of rank minimization
to the theoretical and empirical results of nuclear-norm minimization in the Gaussian
setting.

We prove these results in the next section. In Section 4 we discuss the tightness of
these bounds and compare them with results for nuclear-norm minimization.

3. General Results and Proofs 

On our way to proving our main results, Theorems 2.1 and 2.2, we will prove a
more general result about arbitrary manifolds of real matrices. This result can be
extended even further by considering manifolds over more arbitrary Banach spaces
and following our proof. For convenience we will restrict ourselves to the Banach
space of real matrices. Below, a continuously differentiable manifold is a manifold
that may be equipped with a class of atlases having transition maps which are all
$C^1$-diffeomorphisms.

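Theorem 3.1. Let $\mathcal{R}$ be a $d$-dimensional continuously differentiable manifold over
the set of $n \times n$ real matrices. Suppose we take $m \geq d + 1$ measurements of the
form $\langle A_i, X \rangle$ for $X \in \mathcal{R}$, and define the operator $\mathcal{A} : \mathcal{R} \to \mathbb{R}^m$ which takes these
measurements, $\mathcal{A} : X \mapsto y$ with $y_i = \langle A_i, X \rangle$. Assume that there exists a constant
$C = C(n)$ such that $\mathbb{P}(|\langle A_i, X \rangle| < \varepsilon) < C\varepsilon$ for every $X$ with $\|X\|_F = 1$. Further
assume that for each $X \in \mathcal{R}$ the random variables $\{\langle A_i, X \rangle\}$ are independent.
Then with probability 1,

$\mathrm{Null}(\mathcal{A}) \cap \mathcal{R} \setminus \{0\} = \emptyset$.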

Remarks. 

1. The requirement that $\mathbb{P}(|\langle A_i, X \rangle| < \varepsilon) < C\varepsilon$ says that the densities of $\langle A_i, X \rangle$
do not spike at the origin. A sufficient condition for this to hold for every $X$ with
$\|X\|_F = 1$ is that each $A_i$ has i.i.d. entries with continuous density.

2. The requirement $m \geq d + 1$ is tight in the sense that the result does not generally
hold for $m \leq d$. For example, take $\mathcal{R}$ to be the intersection of any $(d+1)$-dimensional
linear subspace of $\mathbb{R}^{n \times n}$ with the unit sphere. Then it is not hard to show that
$\mathrm{Null}(\mathcal{A}) \cap \mathcal{R} \setminus \{0\} \neq \emptyset$ for any linear operator $\mathcal{A} : \mathbb{R}^{n \times n} \to \mathbb{R}^m$ as long as $m \leq d$.
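
The tightness example in Remark 2 comes down to linear algebra: restricted to a $(d+1)$-dimensional subspace, $m \leq d$ linear measurements form an underdetermined homogeneous system, which always has a nonzero solution. The sketch below finds such a solution exactly with rational arithmetic. The coordinate subspace, the integer-valued measurement matrices and the helper names are assumptions made for this illustration.

```python
from fractions import Fraction
import random


def inner(A, B):
    """Matrix inner product <A, B> as a sum of entrywise products (real case)."""
    return sum(a * b for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))


def null_vector(G):
    """Nonzero exact solution c of G c = 0; assumes G has more columns than rows."""
    rows = [[Fraction(v) for v in row] for row in G]
    n_cols = len(rows[0])
    pivots = []                          # pivots[k] is the pivot column of row k
    for c in range(n_cols):
        r = len(pivots)
        if r == len(rows):
            break
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        rows[r] = [v / rows[r][c] for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
    free = next(c for c in range(n_cols) if c not in pivots)
    sol = [Fraction(0)] * n_cols
    sol[free] = Fraction(1)              # set one free variable to 1
    for k, c in enumerate(pivots):
        sol[c] = -rows[k][free]          # read pivot variables off the reduced rows
    return sol


random.seed(0)
n, d = 3, 4          # 3 x 3 matrices; the subspace has dimension d + 1 = 5
m = d                # m <= d measurements

# Basis of a (d+1)-dimensional coordinate subspace: one unit entry per matrix.
basis = []
for k in range(d + 1):
    E = [[0] * n for _ in range(n)]
    E[k // n][k % n] = 1
    basis.append(E)

# Hypothetical integer-valued measurement matrices A_1, ..., A_m.
As = [[[random.randint(-5, 5) for _ in range(n)] for _ in range(n)] for _ in range(m)]

# A(sum_k c_k B_k) = G c, where G[i][k] = <A_i, B_k> is an m x (d+1) matrix.
G = [[inner(A, B) for B in basis] for A in As]
c = null_vector(G)
X = [[sum(ck * B[i][j] for ck, B in zip(c, basis)) for j in range(n)] for i in range(n)]

print(all(inner(A, X) == 0 for A in As))         # every measurement vanishes
print(any(v != 0 for row in X for v in row))     # yet X is nonzero
```

Dividing `X` by its Frobenius norm places it on the unit sphere without changing the fact that all measurements vanish, which gives a nonzero point of the manifold in the null space.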

To prove this result we will utilize a well-known fact about covering numbers. For a
set $B$, norm $\| \cdot \|$ and value $\varepsilon$, we denote by $N(B, \| \cdot \|, \varepsilon)$ the smallest number of balls
(with respect to the norm $\| \cdot \|$) of radius $\varepsilon$ whose union contains $B$. This number is
called a covering number, and the set of balls covering the space (or more precisely
the set of centers of these balls) is called an $\varepsilon$-net. A bound on the covering number for
the unit ball under the Euclidean norm $\| \cdot \|_2$ is well known (see e.g. Ch. 13
of [16]):

Lemma 3.2. For any $1 > \varepsilon > 0$, we have

$N(B_2^d, \| \cdot \|_2, \varepsilon) \leq \left(\frac{3}{\varepsilon}\right)^d$.

We are now prepared to prove Theorem 3.1.

Proof of Theorem 3.1. For simplicity we take $m = d + 1$. Since $\mathcal{R}$ is a continuously
differentiable manifold, so is $\mathcal{R} \setminus \{0\}$, and this implies that there are a countable
number of closed$^3$ sets $V_i \subset \mathcal{R} \setminus \{0\}$ such that

• $\bigcup_i V_i = \mathcal{R} \setminus \{0\}$

• For each $V_i$, there exists a $C^1$-diffeomorphism $\phi_i : V_i \to B_2^d$. In words, there
is a homeomorphism $\phi_i$ from $V_i$ to the unit Euclidean ball in $\mathbb{R}^d$ (denoted
$B_2^d$) such that $\phi_i$ and $\phi_i^{-1}$ are continuously differentiable.

Our strategy is to show that for fixed $i$, $0 \notin \mathcal{A}(V_i)$ with probability 1. We will then
apply a union bound using the fact that there are only countably many $V_i$.

Fix an $i$, and for convenience set $\phi = \phi_i$ and $V = V_i$. Since $\phi^{-1}$ is continuously
differentiable, it is Lipschitz on the closed set $B_2^d$. Thus there is an $L > 0$ such that

(3.1)   $\|\phi^{-1}(x) - \phi^{-1}(y)\|_F \leq L \|x - y\|_2$.

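Next, let $\overline{B_2^d}$ be an $(\varepsilon/L)$-net for $B_2^d$ with cardinality at most $\left(\frac{3L}{\varepsilon}\right)^d$. This is of course
possible by Lemma 3.2. Then the net $\overline{V}$ defined by $\overline{V} = \phi^{-1}(\overline{B_2^d})$ is an $\varepsilon$-net for $V$.
Indeed, for any $X \in V$, we have $\phi(X) \in B_2^d$ and so there is a $b \in \overline{B_2^d}$ such that

$\|b - \phi(X)\|_2 \leq \frac{\varepsilon}{L}$.

By (3.1) we then have

$\|\phi^{-1}(b) - X\|_F \leq L \cdot \|b - \phi(X)\|_2 \leq L \cdot \frac{\varepsilon}{L} = \varepsilon$.

Since $\phi^{-1}(b) \in \phi^{-1}(\overline{B_2^d})$, this shows that $\overline{V}$ is an $\varepsilon$-net for $V$.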
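
The net construction above can be tried out in the simplest case $d = 1$, where the unit ball is the interval $[-1, 1]$. The sketch below pushes an $(\varepsilon/L)$-net of the interval through a Lipschitz map and checks numerically that the image is an $\varepsilon$-net of the image set. The map `phi_inv`, the bound `L = 3` and the value of `eps` are hypothetical choices for illustration only.

```python
import math


def phi_inv(t):
    """A hypothetical chart inverse from the 1-D ball [-1, 1] into R^2."""
    return (2.0 * t, t * t)


L = 3.0      # Lipschitz bound: the derivative (2, 2t) has norm at most sqrt(8) < 3
eps = 0.1


def interval_net(radius):
    """Centers such that every t in [-1, 1] lies within `radius` of some center."""
    count = math.ceil(1.0 / radius)      # number of cells, each of width <= 2 * radius
    width = 2.0 / count
    return [-1.0 + width * (k + 0.5) for k in range(count)]


ball_net = interval_net(eps / L)              # (eps/L)-net for the ball
image_net = [phi_inv(b) for b in ball_net]    # candidate eps-net for V = phi_inv([-1, 1])


def dist_to_net(point):
    """Euclidean distance from a point to the nearest element of the image net."""
    return min(math.dist(point, q) for q in image_net)


# Check the eps-net property on a fine sample of V.
samples = [phi_inv(-1.0 + 2.0 * k / 2000) for k in range(2001)]
worst = max(dist_to_net(p) for p in samples)
print(worst <= eps)                       # the pushed-forward net covers V at scale eps
print(len(ball_net) <= 3 * L / eps)       # cardinality within the (3L/eps)^d bound, d = 1
```

The cardinality of the net is inherited from the ball, while the Lipschitz constant converts the scale $\varepsilon / L$ in the ball to the scale $\varepsilon$ on the manifold.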

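Using the fact that $\overline{V}$ is an $\varepsilon$-net for $V$, we have that for any $X \in V$, there is an
$\overline{X} \in \overline{V}$ such that $\|X - \overline{X}\|_F \leq \varepsilon$. This then implies

\[
\begin{aligned}
\|\mathcal{A}(X)\|_\infty &\geq \|\mathcal{A}(\overline{X})\|_\infty - \|\mathcal{A}(X - \overline{X})\|_\infty \\
&\geq \|\mathcal{A}(\overline{X})\|_\infty - \|\mathcal{A}\|_{F \to \infty} \|X - \overline{X}\|_F \\
&\geq \|\mathcal{A}(\overline{X})\|_\infty - \varepsilon \cdot \|\mathcal{A}\|_{F \to \infty},
\end{aligned}
\]

where $\| \cdot \|_{F \to \infty}$ denotes the operator norm from the Frobenius norm to the supremum
norm $\| \cdot \|_\infty$. Optimizing over all $X \in V$ and $\overline{X} \in \overline{V}$ yields

$\inf_{X \in V} \|\mathcal{A}(X)\|_\infty \geq \min_{\overline{X} \in \overline{V}} \|\mathcal{A}(\overline{X})\|_\infty - \varepsilon \cdot \|\mathcal{A}\|_{F \to \infty}$.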



$^3$ Note that in general these sets are open, but by writing each $V_i$ as a countable union of closed
sets (for example $V_i = \bigcup_{j=1}^{\infty} \phi_i^{-1}(\{x : \|x\|_2 \leq 1 - 1/j\})$) observe that we can choose them to
be closed.



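We can then bound the probability (over the random choice of $\mathcal{A}$) by:

\[
\begin{aligned}
\mathbb{P}\left(\inf_{X \in V} \|\mathcal{A}(X)\|_\infty = 0\right)
&\leq \mathbb{P}\left(\inf_{X \in V} \|\mathcal{A}(X)\|_\infty < \varepsilon \log(1/\varepsilon)\right) \\
&\leq \mathbb{P}\left(\min_{\overline{X} \in \overline{V}} \|\mathcal{A}(\overline{X})\|_\infty - \varepsilon \cdot \|\mathcal{A}\|_{F \to \infty} < \varepsilon \log(1/\varepsilon)\right).
\end{aligned}
\]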

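Conditioning on whether $\|\mathcal{A}\|_{F \to \infty} > \log(1/\varepsilon)$ and using the law of total probability
yields

\[
\begin{aligned}
&\mathbb{P}\left(\min_{\overline{X} \in \overline{V}} \|\mathcal{A}(\overline{X})\|_\infty - \varepsilon \cdot \|\mathcal{A}\|_{F \to \infty} < \varepsilon \log(1/\varepsilon)\right) \\
&\qquad \leq \mathbb{P}\left(\min_{\overline{X} \in \overline{V}} \|\mathcal{A}(\overline{X})\|_\infty < 2\varepsilon \log(1/\varepsilon)\right) + \mathbb{P}\left(\|\mathcal{A}\|_{F \to \infty} > \log(1/\varepsilon)\right). \qquad (3.2)
\end{aligned}
\]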

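Clearly, for $\varepsilon$ small, the second term in the last line of (3.2) is negligible. Thus it
remains to bound the first term. Letting $z_1, \ldots, z_m$ be the coordinates of $\mathcal{A}(\overline{X})$ for
a given $\overline{X} \in \overline{V}$, we have:

\[
\begin{aligned}
\mathbb{P}\left(\min_{\overline{X} \in \overline{V}} \|\mathcal{A}(\overline{X})\|_\infty < 2\varepsilon \log(1/\varepsilon)\right)
&\leq |\overline{V}| \cdot \mathbb{P}\left(\|\mathcal{A}(\overline{X})\|_\infty < 2\varepsilon \log(1/\varepsilon)\right) \\
&= |\overline{V}| \cdot \mathbb{P}\left(\max\{|z_1|, \ldots, |z_m|\} < 2\varepsilon \log(1/\varepsilon)\right) \\
&\leq \left(\frac{3L}{\varepsilon}\right)^d \cdot \prod_{i=1}^{m} \mathbb{P}\left(|z_i| < 2\varepsilon \log(1/\varepsilon)\right),
\end{aligned}
\]

where in the last line we have utilized the independence of all $z_i = \langle A_i, \overline{X} \rangle$ and the
size of the net $\overline{V}$.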



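Now,

\[
\mathbb{P}\left(|z_i| < 2\varepsilon \log(1/\varepsilon)\right)
= \mathbb{P}\left(|\langle A_i, \overline{X} \rangle| < 2\varepsilon \log(1/\varepsilon)\right)
= \mathbb{P}\left(\left|\left\langle A_i, \frac{\overline{X}}{\|\overline{X}\|_F} \right\rangle\right| < \frac{2\varepsilon \log(1/\varepsilon)}{\|\overline{X}\|_F}\right).
\]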



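Since $V$ is closed and does not contain zero, the Frobenius norm of any $X \in V$
is bounded uniformly away from zero. This combined with the assumption that
$\mathbb{P}(|\langle A_i, X \rangle| < \varepsilon) < C\varepsilon$ for every $X$ with $\|X\|_F = 1$ yields

\[
\begin{aligned}
\left(\frac{3L}{\varepsilon}\right)^d \cdot \prod_{i=1}^{m} \mathbb{P}\left(|z_i| < 2\varepsilon \log(1/\varepsilon)\right)
&\leq \left(\frac{3L}{\varepsilon}\right)^d \cdot \left(4C'\varepsilon \log(1/\varepsilon)\right)^m \\
&= C'' \varepsilon^{m-d} \cdot (\log(1/\varepsilon))^m \\
&= C'' \varepsilon \cdot (\log(1/\varepsilon))^m,
\end{aligned}
\]

where $C$, $C'$ and $C''$ are constants which do not depend on $\varepsilon$. The last equality
follows since $m = d + 1$. Taking $\varepsilon$ to zero once again makes this last term vanish.
Thus the probability that the null space of $\mathcal{A}$ intersects $V$ is zero. Since there are
only countably many $V_i$, the probability that the null space of $\mathcal{A}$ intersects any of
these sets is also zero.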

□ 
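
The proof ends with a bound of the form a constant times $\varepsilon (\log(1/\varepsilon))^m$, which tends to zero as $\varepsilon \to 0$ even though the logarithmic factor grows. A quick numerical look at this decay, with a hypothetical $m = 5$ and the constant set to 1, is sketched below.

```python
import math


def tail_bound(eps, m, const=1.0):
    """Evaluate const * eps * log(1/eps)**m, the shape of the final bound."""
    return const * eps * math.log(1.0 / eps) ** m


m = 5    # hypothetical number of measurements (m = d + 1 with d = 4)
for eps in (1e-2, 1e-6, 1e-12, 1e-30):
    print(eps, tail_bound(eps, m))
```

For moderate $\varepsilon$ the logarithmic factor dominates and the bound exceeds 1, so it only becomes informative once $\varepsilon$ is small. This is harmless in the proof because only the limit matters.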



Now we turn to proving our main result, Theorem 2.1. To prove this theorem, it will
be useful to view the space of rank-$2r$ unit norm matrices as a smooth manifold.
Then Theorem 2.1 will follow as a corollary of Theorem 3.1. We denote by $\| \cdot \|_F$
the usual Frobenius norm for matrices.

Lemma 3.3. The space of rank-$2r$ matrices with unit Frobenius norm,

$\mathcal{R} = \{X \in \mathbb{R}^{n \times n} : \operatorname{rank}(X) = 2r, \|X\|_F = 1\}$,

is a smooth manifold with dimension $4nr - 4r^2 - 1$.



Proof. We will first show that the space of rank-$2r$ matrices (with arbitrary Frobenius
norm) is a smooth manifold. This is a well-known result but we sketch the
proof. Then we will show that the intersection of this space and the sphere of all
unit norm matrices is transverse, which will yield the desired result. To this end, let
$\mathcal{R}'$ be as in (2.1) and let $\mathcal{M} = \{A \in \mathbb{R}^{n \times n}\}$ be the set of all $n \times n$ matrices. Let $\mathcal{G}$ be
the Lie group consisting of the direct product of the general linear group with itself:

$\mathcal{G} = GL(n, \mathbb{R}) \times GL(n, \mathbb{R})$.



For an element $(g, h) \in \mathcal{G}$, let it act on elements $A \in \mathcal{R}'$ by $(g, h)A = gAh^{-1}$. Since
this is just matrix multiplication, this action is clearly continuous. Moreover, it is
transitive. Indeed, let $A, B \in \mathcal{R}'$. Since $A$ has rank $2r$, there are $g, h \in GL(n, \mathbb{R})$ such
that

$gAh^{-1} = \begin{pmatrix} I_{2r} & 0 \\ 0 & 0_{n-2r} \end{pmatrix} =: D$,

where $I_{2r}$ denotes the $2r \times 2r$ identity matrix and $0_{n-2r}$ denotes the $(n - 2r) \times (n - 2r)$
matrix of zeros. Similarly, there are $y, z \in GL(n, \mathbb{R})$ such that $yBz^{-1}$ equals this
same block matrix. Thus $gAh^{-1} = yBz^{-1}$ and so $A = g^{-1}yBz^{-1}h = (g^{-1}y)B(h^{-1}z)^{-1}$, which proves
transitivity since $(g^{-1}y, h^{-1}z) \in \mathcal{G}$. Next let $\mathcal{H}$ be the stabilizer of the matrix
$D$ under the action of $\mathcal{G}$. Since the action of $\mathcal{G}$ is continuous and transitive, the
stabilizer $\mathcal{H}$ is a closed subgroup of the Lie group $\mathcal{G}$ and thus $\mathcal{H}$ is a closed Lie
subgroup. Therefore $\mathcal{G}/\mathcal{H}$ is a smooth manifold. But $\mathcal{H}$ is the stabilizer of $D$ under
$\mathcal{G}$ and so since the action is transitive, $\mathcal{G}/\mathcal{H}$ must be isomorphic to the orbit of $D$
under the action of $\mathcal{G}$. But the orbit of $D$ is precisely our set $\mathcal{R}'$ and thus $\mathcal{R}'$ is also
a smooth manifold.
We now wish to show that $\mathcal{R}$ is also a smooth manifold by viewing $\mathcal{R}$ as the intersection
of $\mathcal{R}'$ and the sphere

$S = \{A \in \mathcal{M} : \|A\|_F = 1\}$.

It is clear that $S$ is a smooth manifold (it is a sphere of the smooth manifold $\mathcal{M}$)
and by the above $\mathcal{R}'$ is also a smooth manifold. Since $\mathcal{R} = \mathcal{R}' \cap S$ and both $\mathcal{R}'$ and $S$ are
smooth manifolds, to show that $\mathcal{R}$ is also a smooth manifold it suffices to show that
this intersection is transverse. That is, we need to show that for any $A \in \mathcal{R}' \cap S$,
the direct sum of the tangent space of $S$ at $A$ and of $\mathcal{R}'$ at $A$ is equal to the tangent
space of $\mathcal{M}$ at $A$:

$T_A(S) \oplus T_A(\mathcal{R}') = T_A(\mathcal{M})$.

Since $S$ has codimension 1 in $\mathcal{M}$, it will suffice to show that $T_A(\mathcal{R}')$ contains a
vector in the direction normal to the sphere. Now for any $A \in \mathcal{R}'$ and $k \neq 0$, the
matrix $kA$ will clearly also have rank $2r$ and thus be contained in $\mathcal{R}'$. Therefore
$\mathcal{R}'$ contains the entire line $L$ through the origin containing $A$ (excluding the actual
origin itself). Then since $L \subset \mathcal{R}'$, we have $L = T_A(L) \subset T_A(\mathcal{R}')$. Thus we must
indeed have $T_A(S) \oplus T_A(\mathcal{R}') = T_A(\mathcal{M})$. Therefore the intersection $\mathcal{R} = S \cap \mathcal{R}'$ is
transverse and so it is a smooth manifold.

Finally, it is well known that the dimension of the manifold $\mathcal{R}'$ is $4nr - 4r^2$ (see [14,
Chapter 8]) and the codimension of $S$ in $\mathcal{M}$ is 1. Thus $\mathrm{codim}(\mathcal{R}) = \mathrm{codim}(S) +
\mathrm{codim}(\mathcal{R}') = n^2 - (4nr - 4r^2) + 1$ and so the dimension of $\mathcal{R}$ is $4nr - 4r^2 - 1$.

□ 

We finally show that Theorem 2.1 and Theorem 2.2 follow as corollaries.

Proof of Theorem 2.1. By Lemma 3.3, $\mathcal{R}$ is a smooth manifold of dimension $d =
4nr - 4r^2 - 1$, and note that clearly $\mathcal{R} = \mathcal{R} \setminus \{0\}$. Let $\mathcal{A}$ be the operator taking
$m \geq 4nr - 4r^2$ Gaussian measurements $\langle A_i, X \rangle$ for $X \in \mathcal{R}$, with $A_i$ (for $i = 1, 2, \ldots, m$)
having i.i.d. Gaussian entries. Then all $\langle A_i, X \rangle$ are independent and have (the
same) continuous density. Therefore by Theorem 3.1, $\mathrm{Null}(\mathcal{A}) \cap \mathcal{R} = \emptyset$. Applying
Theorem 3.1 for all ranks between 1 and $2r$, we see that there is no nonzero matrix of rank
$2r$ or less in the null space of $\mathcal{A}$. Thus when $M$ has rank $r$ (or less) there can be
no other matrix $X$ with $\mathcal{A}(X) = \mathcal{A}(M)$ having the same or lower rank. This proves
that (1.1) must recover the matrix $M$ and completes the proof. □

Proof of Theorem 2.2. Let $\mathcal{W} = \{X - M : \operatorname{rank}(X) = r\}$. Note the proof of
Lemma 3.3 explicitly shows that the space of all matrices of a fixed rank $r$ is a
smooth manifold of dimension $2nr - r^2$. Since $\mathcal{W}$ is a shift of this space, it is also a
smooth manifold of the same dimension. Then by Theorem 3.1 we have that with
probability one

$\mathcal{W} \setminus \{0\} \cap \mathrm{Null}(\mathcal{A}) = \emptyset$.
Repeating this for ranks 1 through $r$, we get that with probability one

(3.3)   $\mathcal{W}' \setminus \{0\} \cap \mathrm{Null}(\mathcal{A}) = \emptyset$,

where $\mathcal{W}' = \{X - M : \operatorname{rank}(X) \leq r\}$. Now let $\hat{X}$ be the solution of the rank
minimization problem (1.1). Since $M$ has rank $r$ and is a feasible matrix, $\operatorname{rank}(\hat{X}) \leq r$
as well. Thus $\hat{X} - M \in \mathcal{W}'$. But since $\mathcal{A}(\hat{X}) = \mathcal{A}(M)$, $\hat{X} - M \in \mathrm{Null}(\mathcal{A})$. Thus
by (3.3) it must be that $\hat{X} - M = 0$, which shows $\hat{X} = M$ is the recovered matrix. □

4. Discussion 

The bounds on the number of measurements given by Theorems 2.1 and 2.2, namely
$4nr - 4r^2$ and $2nr - r^2 + 1$, are analogous to the bounds of $2s$ and $s + 1$ in compressed
sensing. As we did in the compressed sensing case, it is of course insightful to compare
rank minimization and nuclear-norm minimization. To (provably) recover $n \times n$
rank-$r$ matrices using nuclear-norm minimization, one needs $Cnr$ measurements. As
discussed in [3], by observing that the space of rank-$r$ matrices has a subspace which
consists of all rank-$r$ matrices whose last $n - r$ rows are zero, one sees that at least
$2nr$ measurements are required to recover all rank-$r$ matrices. In [21] explicit formulas
and graphs are given from which bounds on the constant $C$ can be derived.
Even more recent results in [22] prove that $6nr$ measurements suffice for weak recovery
and $16nr$ measurements suffice for strong recovery. New work in [23], [21] also
shows weak recovery when $m > 6nr - 3r^2$. In addition, numerical results indicate
that weak recovery requires about $4nr - 2r^2$ Gaussian measurements [21, Figure
1]. Thus according to these results, rank minimization does succeed with somewhat
fewer measurements. We emphasize that this should not be a surprise: nuclear-norm
minimization is a tractable method whereas rank minimization is an intractable
method whose guarantees give us theoretical bounds with which to compare. In fact,
the price to pay for a tractable method in low-rank matrix recovery seems to be a
very reasonable one.

As discussed above, our general manifold result, Theorem 3.1, is tight. However,
this does not imply that its consequences, Theorems 2.1 and 2.2, are tight, since the
set of matrices of fixed rank is not a linear subspace. We conjecture that the strong
recovery requirement $m \geq 4nr - 4r^2$ from Theorem 2.1 is tight because the number
of measurements required matches the dimension of the underlying manifold. In the
case of the weak recovery requirement $m \geq 2nr - r^2 + 1$ given by Theorem 2.2,
we require $m$ to be one greater than the dimension of the underlying manifold.
However, we once again conjecture this to be tight, at least within an additive term
of one, for the same reason.

Our results in conjunction with work on nuclear-norm minimization show how close
nuclear-norm minimization guarantees are to those of the intractable problem of rank
minimization. While rank minimization requires fewer measurements, the additional
measurements needed by nuclear-norm minimization are not at all an unreasonable
price to pay in order to solve the problem via a computationally feasible method.